List of Flash News about SWE Bench
| Time | Details |
|---|---|
| 2026-06-17 18:30 | Stanford AI Lab: DeLM Cuts Agent Orchestration Costs<br>Stanford AI Lab details DeLM, decentralized models that deliver 10% SWE-bench gains with Gemini-3 Flash at under half the cost on coding and Q&A tasks. |
| 2026-05-29 03:56 | Claude Opus 4.8: Tops SWE-Bench Pro at 69.2%<br>Claude Opus 4.8 scores 69.2% on SWE-Bench Pro to lead in agentic coding and adds self-doubt honesty, while trailing GPT-5.5 on Terminal-Bench 2.1. |